Discriminant Sub-Space Projection of Spectro-Temporal Speech Features Based on Maximizing Mutual Information

نویسندگان

Martin Heckmann

Claudius Gläser

چکیده

We previously developed noise robust Hierarchical SpectroTemporal (HIST) speech features. The learning of the features was performed in an unsupervised way with unlabeled speech data. In a final stage we deployed Principal Component Analysis (PCA) to reduce the feature dimensions and to diagonalize them. In this paper we investigate if a discriminant projection can further increase the performance. We maximize the mutual information between the features and the phoneme categories using a procedure known as Maximizing Renyi’s Mutual Information (MRMI) and also compare it to Linear Discriminant Analysis (LDA). Based on recognition tests in clean and in noise, i. e. in matching and mismatching conditions, we show that the discriminant projections increases recognition scores compared to PCA in matching conditions. However, this improvement does not transfer to the mismatching, i. e. noisy, conditions. We discuss measures to alleviate this problem. Overall MRMI performs better than LDA.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in spectro-temporal features space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change during the time. The clusters temporally tracked and temporal tra...

متن کامل

Maximum conditional mutual information projection for speech recognition

Linear discriminant analysis (LDA) in its original modelfree formulation is best suited to classification problems with equal-covariance classes. Heteroscedastic discriminant analysis (HDA) removes this equal covariance constraint, and therefore is more suitable for automatic speech recognition (ASR) systems. However, maximizing HDA objective function does not correspond directly to minimizing ...

متن کامل

Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition

Discriminative training of feature space using maximum mutual information (fMMI) objective function has been shown to yield remarkable accuracy improvements. For noisy environments, fMMI can be regarded as an effective noise compensation algorithm and can play a significant role for noise robustness. Feature space speaker adaptation techniques such as feature space maximum likelihood linear reg...

متن کامل

Optimal feature sub-space selection based on discriminant analysis

The performance of a speech recogniser, or of any other pattern classifier, strongly depends on the input features: to obtain a good performance, the feature set needs to be both highly discriminative and compact. Linear discriminant analysis (LDA) is a common data-driven method used to find linear transformations that map large feature vectors onto smaller ones while retaining most of the disc...

متن کامل

Minimum Classification Error Based Spectro-Temporal Feature Extraction for Robust Audio Classification

Mel-frequency cepstral coefficients (MFCCs) are the most popular features for automatic audio classification (AAC). However, MFCCs are often not robust in adverse environment. In this paper, a minimum classification error (MCE)-based method is proposed to extract new and robust spectro-temporal features as alternatives to MFCCs. The robustness of the proposed new features is evaluated on noisy ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

Discriminant Sub-Space Projection of Spectro-Temporal Speech Features Based on Maximizing Mutual Information

نویسندگان

چکیده

منابع مشابه

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

Maximum conditional mutual information projection for speech recognition

Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition

Optimal feature sub-space selection based on discriminant analysis

Minimum Classification Error Based Spectro-Temporal Feature Extraction for Robust Audio Classification

عنوان ژورنال:

اشتراک گذاری